Prediction of Poly(A) Sites by Poly(A) Read Mapping
Abstract
RNA-seq reads containing part of the poly(A) tail of transcripts (denoted as poly(A) reads) provide the most direct evidence for the position of poly(A) sites in the genome. However, due to reduced coverage of poly(A) tails by reads, poly(A) reads are not routinely identified during RNA-seq mapping. Nevertheless, recent studies for several herpesviruses successfully employed mapping of poly(A) reads to identify herpesvirus poly(A) sites using different strategies and customized programs. To more easily allow such analyses without requiring additional programs, we integrated poly(A) read mapping and prediction of poly(A) sites into our RNA-seq mapping program ContextMap 2. The implemented approach essentially generalizes previously used poly(A) read mapping approaches and combines them with the context-based approach of ContextMap 2 to take into account information provided by other reads aligned to the same location. Poly(A) read mapping using ContextMap 2 was evaluated on real-life data from the ENCODE project and compared against a competing approach based on transcriptome assembly (KLEAT). This showed high positive predictive value for our approach, evidenced also by the presence of poly(A) signals, and considerably lower runtime than KLEAT. Although sensitivity is low for both methods, we show that this is in part due to a high extent of spurious results in the gold standard set derived from RNA-PET data. Sensitivity improves for poly(A) sites of known transcripts or determined with a more specific poly(A) sequencing protocol and increases with read coverage on transcript ends. Finally, we illustrate the usefulness of the approach in a high read coverage scenario by a re-analysis of published data for herpes simplex virus 1. Thus, with current trends towards increasing sequencing depth and read length, poly(A) read mapping will prove to be increasingly useful and can now be performed automatically during RNA-seq mapping with ContextMap 2.
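To make the notion of a poly(A) read concrete, the following is a minimal sketch of one simple heuristic for spotting candidate poly(A) reads and clipping the tail so that the remaining sequence could be aligned to the genome. It is not the ContextMap 2 implementation: the function names `trailing_run` and `clip_poly_a`, the thresholds `min_tail` and `min_body`, and the rule of requiring an uninterrupted run of A (or of T for reads sequenced from the opposite strand) are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def trailing_run(seq: str, base: str) -> int:
    """Length of the uninterrupted run of `base` at the end of `seq`."""
    n = 0
    for ch in reversed(seq):
        if ch != base:
            break
        n += 1
    return n


def clip_poly_a(
    read: str, min_tail: int = 8, min_body: int = 20
) -> Optional[Tuple[str, str]]:
    """Return (clipped_sequence, tail_kind) for a candidate poly(A) read.

    Hypothetical heuristic: a read qualifies if it ends in at least
    `min_tail` A's, or starts with at least `min_tail` T's, and at least
    `min_body` bases remain after clipping. Returns None otherwise.
    """
    read = read.upper()
    a_run = trailing_run(read, "A")        # A's at the end of the read
    t_run = trailing_run(read[::-1], "T")  # T's at the start of the read

    if a_run >= min_tail and a_run >= t_run and len(read) - a_run >= min_body:
        return read[: len(read) - a_run], "A-tail"
    if t_run >= min_tail and len(read) - t_run >= min_body:
        return read[t_run:], "T-head"
    return None


if __name__ == "__main__":
    body = "ACGTTGCAGGCTAGCTAGGCTTAGC"
    print(clip_poly_a(body + "A" * 12))
    print(clip_poly_a("T" * 12 + body))
    print(clip_poly_a(body))
```

Under these assumptions, the clipped end of the aligned remainder would mark a candidate poly(A) site. A real pipeline would additionally need to tolerate sequencing errors inside the tail and to reject genomic A-rich stretches, for example by using other reads aligned to the same location as the abstract describes.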